
DEVELOPMENT OF A PERFORMANCE EVALUATION TEST FOR ENVIRONMENTAL RESEARCH (PETER):




CRITICAL  TRACKING  TEST 


Presented  at  the  50th  Annual  Meeting  of  the  Aerospace  Medical  Association 

Washington,  DC,  May,  1979 




Diane  L.  Damos,  Ph.D. 

Department  of  Industrial  Engineering 
State  University  of  New  York  at  Buffalo,  New  York 

Robert S. Kennedy

Alvah C. Bittner Jr., Ph.D.


Naval  Aerospace  Medical  Research  Laboratory  Detachment 
New  Orleans,  Louisiana 




NAVAL  AEROSPACE  MEDICAL  RESEARCH  LABORATORY 
NEW  ORLEANS,  LOUISIANA 


The  opinions  are  those  of  the  author  and  do  not  necessarily  reflect  those  of 
the  Department  of  the  Navy. 


DISTRIBUTION STATEMENT A
Approved for public release; distribution unlimited


Development of a Performance Evaluation Test for Environmental Research (PETER):

Diane Damos, State University of New York at Buffalo
Robert S. Kennedy and Alvah C. Bittner Jr.

Naval Aerospace Medical Research Laboratory Detachment, New Orleans, LA


Critical Tracking


A need exists for a standardized performance test battery to study the
effects of unusual environments which may be encountered by Navy
personnel. Such a test battery must be sufficiently sensitive so that
the deleterious effects of these exotic environments can be identified.
The concern at the Naval Aerospace Medical Research Laboratory (NAMRL)
is primarily with inertial environments. Of particular interest from a
performance standpoint are the very low (<1 Hz) frequency motions which
occasion seasickness and the higher (>1 Hz) vibrations which have
biodynamic effects. A research program is underway to develop a
performance test battery with early emphasis on tests which tap
information processing, cognitive and perceptual functions. The general
plan for the development of the Performance Evaluation Test for
Environmental Research (PETER) is discussed elsewhere (5). The present
study is the second in a series which report the results of evaluations
of various tests. The Critical Tracking Test selected for study (3)
differs somewhat from other tracking tasks in that it requires the
operator to stabilize an unstable control element. The degree of
instability of the control element is represented by a variable λ,
which reflects the sum of the effective time delays of the display and
of the operator's perceptual processing, neural transport, and
neuromuscular delays. The value of λ at which the operator can barely
control the system is used as a dependent measure of his performance.
The purpose of the study described below was to obtain baseline
measures of performance on a Critical Tracking Task to ascertain how
much baseline pretesting is required for stability.

Method: The Critical Tracking Task with autopacer used in this
experiment was instrumented to replicate that of Jex, McDonnell and
Phatak (3). The function of the autopacer was initially to increase the
degree of instability quickly and then more slowly as the operator's
control limits were approached. The only input to this system was the
operator's remnant. The task was displayed on a 14 cm circular CRT with
two sets of brackets painted on its surface. The first set of brackets
was separated by 0.8 cm, representing the "good performance" range.
Subjects were informed that the best performance resulted when the
cursor was kept within these brackets. The large brackets, which were
separated by 8.0 cm, denoted the range outside of which control was
lost easily. The cursor moved only in the vertical dimension and was
controlled by compatible forward-backward movements of an isometric
control stick. The trial ended when the subject lost control of the
cursor and it reached the edge of the display. With no human
controller, λ = .84. The isometric control stick was inserted in a
table top slightly to the right of the display. The task logic was
programmed on an EAI PACE TR-48 Analog Computer. The experimenter read
the trial value of λ from a digital display.
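
The task logic described above can be illustrated with a rough digital
sketch. This is not the analog program used in the study. It assumes a
first-order unstable element dx/dt = λ(x + u), an operator modeled as a
delayed proportional corrector plus a small random remnant, and an
autopacer that switches from a fast to a slow ramp the first time the
error exceeds a threshold. The function name run_trial and every
parameter value (gain, delay, ramp rates, threshold, display edge) are
hypothetical choices made for illustration only.

```python
import random
from collections import deque


def run_trial(delay_s, gain=1.5, remnant_sd=0.02, seed=0, dt=0.005,
              lam_start=1.0, fast_rate=0.5, slow_rate=0.125,
              slow_threshold=0.1, edge=1.0, max_time=120.0):
    """Simulate one trial and return lambda when the cursor reaches the edge.

    Every default value here is an illustrative assumption, not a setting
    taken from the study.
    """
    rng = random.Random(seed)            # fixed seed: repeatable remnant
    n_delay = int(round(delay_s / dt))   # operator delay in time steps
    x = 0.01                             # small initial cursor offset
    # Buffer of past cursor positions; seen[0] is what the operator
    # perceives now, i.e. the position n_delay steps ago.
    seen = deque([x] * (n_delay + 1), maxlen=n_delay + 1)
    lam, t, slowed = lam_start, 0.0, False
    while abs(x) < edge and t < max_time:
        # Operator: delayed proportional correction plus random remnant.
        u = -gain * seen[0] + rng.gauss(0.0, remnant_sd)
        # Unstable element dx/dt = lam * (x + u), forward Euler step.
        x += dt * lam * (x + u)
        seen.append(x)
        # Autopacer: fast ramp at first, slow ramp once the error has grown.
        if abs(x) > slow_threshold:
            slowed = True
        lam += dt * (slow_rate if slowed else fast_rate)
        t += dt
    return lam


if __name__ == "__main__":
    for delay in (0.10, 0.20):
        print(f"delay {delay:.2f} s -> terminal lambda {run_trial(delay):.2f}")
```

Under these assumptions a shorter effective operator delay lets the
simulated trial run to a higher terminal λ, which is the sense in which
the score indexes the operator's control limits. The numbers produced
are not comparable to the values reported in this paper.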

A repeated measures design was used. Each subject received 15 trials on
each of 15 consecutive weekdays. Eighteen Navy enlisted men between the
ages of 19 and 24, with 20/20 corrected vision, participated in the
experiment. All subjects were volunteers recruited, evaluated and
employed in accordance with Secretary of the Navy Instruction 3900.39
and Bureau of Medicine and Surgery Instruction 3900.6.

Results: The overall impression is of a learning curve which may not be
level by Day 15. The means range from 4.38 to 6.75 while the standard
deviations vary from .62 to .95. An analysis of variance conducted on
the data revealed practice and subjects effects (p < .01). The
correlations deserve special mention: the average of the test-retest
correlations for Days 1 - 5 is .639; for Days 6 - 10, .767; and for
Days 11 - 15, .831. Figure 2 shows correlation coefficients of Days 1,
2, 4, 9 and 13 with the days which followed. A generally declining
function is evident which is similar to those previously reported,
e.g., by Jones (4).
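
For readers who wish to compute block averages of this kind from their
own data, a minimal sketch follows. It assumes that the block average is
the mean of the Pearson correlations over all pairs of days within the
block. The paper does not state whether all pairs or only adjacent days
were used, so that choice is an assumption. The function names and the
small data set are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_block_correlation(scores_by_day, days):
    """Average the between-day correlations over every pair of days given.

    scores_by_day maps a day number to the list of subjects' scores for
    that day, with subjects in the same order on every day.
    """
    pairs = list(combinations(days, 2))
    total = sum(pearson(scores_by_day[a], scores_by_day[b]) for a, b in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    # Hypothetical daily mean lambda scores for four subjects on three days.
    scores_by_day = {
        1: [4.1, 4.9, 3.8, 5.2],
        2: [4.6, 5.1, 4.4, 5.5],
        3: [5.0, 5.6, 4.3, 5.9],
    }
    print(round(mean_block_correlation(scores_by_day, [1, 2, 3]), 3))
```

The same two functions can also produce the kind of function plotted in
Figure 2 by correlating one fixed day with each later day in turn.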

Discussion: The majority of exotic environment experiments have been
concerned with demonstrating that a given task is sensitive to the
environment under consideration. The rationale for determining the
sensitivity of a given task follows the form of "Student's t-test" for
repeated measures:

$$ t = \frac{\bar{X}_e - \bar{X}_c}{\sqrt{\left(SD_e^2 + SD_c^2 - 2\, r_{ec}\, SD_e\, SD_c\right)/N}} \qquad (1) $$


where $\bar{X}_e$ and $\bar{X}_c$ are the respective mean performances
for the experimental and control conditions, $SD_e$ and $SD_c$ are the
respective standard deviations, $r_{ec}$ is the correlation between
scores in the two conditions, and N the number of subjects. Generally,
the experimenter attempts to stabilize performance on the task before
the subject is exposed to the exotic environment. Thus, $\bar{X}_c$
represents a baseline for determining performance changes induced by
the environment and is often obtained after several practice sessions.
A statistical difference between the baseline mean and measures
obtained during the experimental condition, $\bar{X}_e$, would indicate
that the task is sensitive to the environment. More sophisticated
statistical treatments, which use several pre-, per- and post-measures,
generalize on this approach. In most of these approaches symmetry of
the correlations is required, i.e., the correlations between all
sessions must be equal. This assumption is not ordinarily met and will
be discussed below. Although practice to a baseline is the most often
used approach, determining the point at which performance has
"stabilized" is frequently difficult. This may be because performance
invariably continues to improve, and an asymptote, while expected, is
frequently difficult




to approach. Also, changes in motivation and fatigue can obscure the
approach to asymptote (1). Therefore, it is sometimes impractical to
obtain a baseline which is stable enough to detect changes in
performance induced by the experimental treatment. This problem was
apparent in Figure 1, where asymptotic performances were not attained
until well into the third week, if at all. Throughout approximately 180
trials the mean values continued to increase. Although the terminal
value is similar to that reported by Jex et al. (3), performance may
have reached only a temporary plateau. Hence changes in performance
induced by an exotic environment might be obscured by learning.
To circumvent the problem associated with an unstable baseline, some
have attempted to improve the sensitivity of the t-test by addressing
elements ($SD_c$, $SD_e$, $r_{ec}$ and N) of the denominator (Eq 1).
For exotic environments, however, there are practical limitations to
increasing sample size and so repeated measures designs have been
employed in order to control variability, i.e., the denominator of
(Eq 1). Unwanted, although usually inherent, sequence effects of
different sorts (e.g., factor structure changes (2)) often make this
approach untenable. Moreover, SD often changes as $\bar{X}$ increases
(4) when learning occurs, and large changes in SD can result in lowered
reliability. The last element to be addressed is the "sustained"
reliability of the retest. It is felt that this is a frequently
overlooked statistic, at least from the standpoint of improving the
precision of a performance test in an unusual environment. Those who
have studied how r changes with repetitions were more interested in the
subject of skill acquisition, but the findings are directly relevant to
PETER and may prove very useful in interpreting the findings. For
example, Days 11 - 15 had an average correlation of .831. This value
may indicate that the test-retest reliability of the critical tracking
task is sufficiently high to provide a sensitive t-test even though the
baseline is not stable and the variances are not particularly small.
Concerning the reliability, however, an inspection of Figure 2 shows
that while reliabilities are generally high, they decline as a function
of remoteness from the trial with which they are compared. That is,
retest reliabilities are poorer with repeated trials. However, the
overall reliability of performance after Day 4 appears to be
substantially higher than for previous days, suggesting that about 5
days practice (75 trials) should be used to attain stability. In
summary, many tasks have
not been considered for use in exotic environment experiments because
their performance does not stabilize without extensive practice.
Although performance on the critical tracking task improved
substantially during approximately 180 trials, the test-retest
reliabilities, while declining with increasing trial differences, were
reasonably high. This indicates that, while the power of a statistical
comparison (e.g., t-test) may be sufficient to detect performance
differences induced by an exotic environment despite the changing
baseline, too many sessions (or trials) may attenuate the power. It
further suggests that the nature of what is being measured by the
critical task may be changing in "factor structure". It appears that
close attention must be paid to the test-retest reliabilities in
addition to the stability of the baseline and the size of the variance
in selecting battery tasks. From considerations of test-retest
reliability, it may be found that tests that usually would be excluded
from an exotic environment experiment would be most sensitive to
detection of environmental effects. Moreover, inspection of where the
reliability reaches its highest and most stable point may be employed
as a criterion for how much pretesting is required.
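
The dependence of the repeated measures t (Eq 1) on the test-retest
correlation can be made concrete with a short worked sketch. The mean
difference of 0.3 and the common standard deviation of 0.8 are
hypothetical values chosen only to fall near the range reported in the
Results. N = 18 and the three correlations (.639, .767, .831) are taken
from this study. The function name is an assumption.

```python
from math import sqrt


def repeated_measures_t(mean_e, mean_c, sd_e, sd_c, r_ec, n):
    """Repeated measures t in the form of Eq 1.

    mean_e, mean_c: experimental and control (baseline) means
    sd_e, sd_c:     the corresponding standard deviations
    r_ec:           correlation between scores in the two conditions
    n:              number of subjects
    """
    variance_of_difference = sd_e ** 2 + sd_c ** 2 - 2.0 * r_ec * sd_e * sd_c
    return (mean_e - mean_c) / sqrt(variance_of_difference / n)


if __name__ == "__main__":
    # Hypothetical decrement of 0.3 from a baseline of 6.0, SD = 0.8 in
    # both conditions, N = 18 subjects as in the present study.
    for r in (0.639, 0.767, 0.831):
        t = repeated_measures_t(5.7, 6.0, 0.8, 0.8, r, 18)
        print(f"r = {r:.3f}  t = {t:.2f}")
```

With these illustrative inputs the magnitude of t rises from about 1.9
at r = .639 to about 2.7 at r = .831 for the same mean difference and
the same variances, which is the sense in which a high test-retest
reliability can compensate for variances that are not particularly
small.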

References:

1. Bradley, J. V., 1969. Practice to an asymptote. Journal of Motor
Behavior. 2: 285-295.

2. Fleishman, E. A., 1960. Abilities at different stages of practice in
rotary pursuit performance. Journal of Experimental Psychology. 60:
162-171.

3. Jex, H. R., McDonnell, J. D. and Phatak, A. V., 1966. A "critical"
tracking task for manual control research. IEEE Transactions on Human
Factors in Electronics. HFE-7: 138-145.

4. Jones, M. B., 1969. Principles of skill acquisition - Differential
processes in acquisition. New York: Academic Press.

5. Kennedy, R. S., and Bittner, A. C., Jr., 1977. The development of a
Navy Performance Evaluation Test for Environmental Research (PETER).
Productivity Enhancement: Personnel Performance Assessment in Navy
Systems, Naval Personnel Research and Development Center, San Diego,
CA. Oct. 1977. (AD A056047).




